Supplement to “Ensemble Subsampling for Imbalanced Multivariate Two-Sample Tests”

Authors

  • Lisha Chen
  • Wei Dou
Abstract

In this supplemental article, we provide detailed proofs for the propositions and theorems in the main paper. We write the indicator function of the event $A$ as $1_A$. Let $X_1, \dots, X_n$ and $Y_1, \dots, Y_{\tilde{n}}$ be independent random samples in $\mathbb{R}^d$ from unknown distributions $F$ and $G$, respectively, with corresponding densities $f$ and $g$ with respect to Lebesgue measure. The densities are assumed to be continuous on their supports. The two-sample test can be stated as $H_0 : F = G$ versus $H_1 : F \neq G$. Denote the two sets of indices $\Omega_x = \{1, \dots, n\}$ and $\Omega_y = \{n+1, \dots, m\}$, with $m = n + \tilde{n}$. We...
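The notation in the abstract can be made concrete with a minimal Python sketch. It pools the two samples, builds 0-based versions of the index sets Ωx and Ωy, and evaluates a same-sample indicator. The function names (`pooled_sample`, `same_sample_indicator`, `nn_same_sample_proportion`) are hypothetical and do not come from the paper. The last function computes the classical proportion of k nearest neighbors that share a sample with their query point, included only as an illustration of how the indicator is typically used in nearest-neighbor two-sample tests; it is not the authors' ensemble subsampling procedure.

```python
import numpy as np


def pooled_sample(x, y):
    """Stack X (n x d) and Y (n_tilde x d) into one pooled sample Z.

    Returns Z together with 0-based index sets: omega_x = {0, ..., n-1}
    and omega_y = {n, ..., m-1}, where m = n + n_tilde. (The abstract
    uses 1-based indices; the shift is an implementation choice.)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, n_tilde = len(x), len(y)
    m = n + n_tilde
    z = np.vstack([x, y])
    omega_x = np.arange(0, n)
    omega_y = np.arange(n, m)
    return z, omega_x, omega_y


def same_sample_indicator(i, j, n):
    """Indicator of the event that pooled indices i and j lie in the same sample."""
    return int((i < n) == (j < n))


def nn_same_sample_proportion(z, n, k=1):
    """Proportion of k-nearest-neighbor pairs whose two points share a sample.

    Illustrative assumption: Euclidean distance and brute-force search.
    """
    m = len(z)
    # Pairwise Euclidean distances; a point is never its own neighbor.
    dist = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)
    nbrs = np.argsort(dist, axis=1)[:, :k]
    # True where a point and its neighbor are both in X or both in Y.
    same = (np.arange(m)[:, None] < n) == (nbrs < n)
    return float(same.mean())
```

With two well-separated samples, every nearest neighbor comes from the same sample and the proportion is 1; under H0 with balanced samples it would sit near 1/2. The imbalanced case (n much smaller than ñ) is the setting the main paper addresses.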


Similar resources

Ensemble Subsampling for Imbalanced Multivariate Two-Sample Tests

Some existing nonparametric two-sample tests for equality of multivariate distributions perform unsatisfactorily when the two sample sizes are unbalanced. In particular, the power of these tests tends to diminish with increasingly unbalanced sample sizes. In this article, we propose a new testing procedure to solve this problem. The proposed test, based on the nearest neighbor method by Schilli...


K-sample subsampling in general spaces: The case of independent time series

The problem of subsampling in two-sample and K-sample settings is addressed where both the data and the statistics of interest take values in general spaces. We focus on the case where each sample is a stationary time series, and construct subsampling confidence intervals and hypothesis tests with asymptotic validity. Some examples are also given, and the problem of optimal block size choice is...


K-sample Subsampling

The problem of subsampling in two-sample and K-sample settings is addressed where both the data and the statistics of interest take values in general spaces. We show the asymptotic validity of subsampling confidence intervals and hypothesis tests in the case of independent samples, and give a comparison to the bootstrap in the K-sample setting.


Tempering by Subsampling

In this paper we demonstrate that tempering Markov chain Monte Carlo samplers for Bayesian models by recursively subsampling observations without replacement can improve the performance of baseline samplers in terms of effective sample size per computation. We present two tempering by subsampling algorithms, subsampled parallel tempering and subsampled tempered transitions. We provide an asympt...


Sample Subset Optimization for Classifying Imbalanced Biological Data

Data in many biological problems are often compounded by imbalanced class distribution. That is, the positive examples may be largely outnumbered by the negative examples. Many classification algorithms such as support vector machine (SVM) are sensitive to data with imbalanced class distribution, and result in a suboptimal classification. It is desirable to compensate for the imbalance effect in model...



Publication date: 2013